ABSTRACT

The most commonly used definition of halo formation is the time when a halo's most 
massive progenitor first contains at least half the final mass of its parent. Reasonably 
accurate formulae for the distribution of formation times of haloes of fixed mass have 
been available for some time. We use numerical simulations of hierarchical gravita- 
tional clustering to test the accuracy of formulae for the mass at formation. We also 
derive and test a formula for the joint distribution of formation masses and times. 
The structure of a halo is expected to be related to its accretion history. Our tests 
show that our formulae for formation masses and times are reasonably accurate, so 
we expect that they will aid future analytic studies of halo structure. 

Key words: galaxies: clustering - cosmology: theory - dark matter.



1 INTRODUCTION 

There is a simple analytic approximation for the distribu- 
tion of halo formation times, when formation is defined as 
the time when the most massive progenitor first contains 
at least half the mass of the final object (Lacey & Cole 
1993, 1994). (Throughout, we will use the word parent to 
denote the final object, and the word progenitor to denote 
the smaller pieces which made up the mass of the parent 
at some earlier time.) This formula provides a good descrip- 
tion of what is seen in numerical simulations of gravitational 
clustering from Gaussian initial conditions, although recent 
work indicates that the agreement is not perfect (e.g., Wu 
2001; Lin, Jing & Lin 2003). The sense of the discrepancy 
is that haloes in simulations appear to form slightly earlier 
than predicted, in qualitative agreement with previous work 
by Tormen (1998). 

A related question is, what is the distribution of the 
mass of a halo at formation? Absent other information, nat- 
ural assumptions about this distribution are (i) that it is 
a delta function centered at one-half, or (ii) that the for- 
mation mass is uniformly distributed between one-half and 
unity. The second assumption is motivated by the fact that 
halo formation is expected to be a stochastic process; haloes 
of the same mass may have had different formation histo- 
ries. The main purpose of the present paper is to derive 
and test a formula for the joint distribution of formation 
times and masses. Section 2 studies the distribution of formation masses whatever the formation time. It shows that the distributions of masses just prior to, and just after, formation measured in simulations are both significantly different from delta functions, or from a uniform distribution, but are rather similar to simple formulae for these quantities derived by Nusser & Sheth (1999). Section 3 studies the conditional
distribution of the formation mass, when the formation time 
is known. This distribution is much better fit by a formula we 
derive here, than by a delta function or a uniform distribu- 
tion. A final section summarizes our findings, and discusses 
possible applications. 



2 THE DISTRIBUTION OF FORMATION MASSES

For what follows, it is useful to introduce some notation. We will use $\delta_{\rm sc}(z)$ to denote the value of the overdensity required for spherical collapse at $z$, extrapolated using linear theory to the present time (e.g. Peebles 1993), and $\sigma^2(m)$ will denote the variance in the initial density fluctuation field when smoothed with a tophat filter of comoving scale $R = (3m/4\pi\bar\rho)^{1/3}$, extrapolated using linear theory to the present time, where $\bar\rho$ is the comoving background density. Thus, the shape of the initial power spectrum determines the relation between $\sigma$ and $m$. At any $z$, there is a characteristic mass scale defined by $\sigma(m) = \delta_{\rm sc}(z)$. We will use $M_*(z)$ to denote this mass scale, and will often express masses in units of this characteristic mass.
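
As a rough illustration of how the characteristic mass follows from this definition, the sketch below assumes a power-law variance, $\sigma^2(m) = (m/m_{\rm norm})^{-\alpha}$, and an Einstein-de Sitter background in which $\delta_{\rm sc}(z) = 1.686\,(1+z)$. The power-law form and the names `sigma2`, `delta_sc`, `m_star`, `alpha` and `m_norm` are assumptions made for this example only; they are not the spectra of the GIF runs.

```python
import math

DELTA_SC_TODAY = 1.686  # spherical-collapse threshold at z = 0 (Einstein-de Sitter)


def delta_sc(z):
    """Collapse threshold extrapolated to the present, assuming Omega = 1."""
    return DELTA_SC_TODAY * (1.0 + z)


def sigma2(m, alpha=1.0, m_norm=1.0):
    """Hypothetical power-law variance; alpha = 1 corresponds to white noise."""
    return (m / m_norm) ** (-alpha)


def m_star(z, alpha=1.0, m_norm=1.0):
    """Mass at which sigma(m) = delta_sc(z) for the power-law variance above."""
    return m_norm * delta_sc(z) ** (-2.0 / alpha)


# Characteristic mass relative to its z = 0 value at the three parent redshifts.
for z in (0.0, 0.5, 1.0):
    print(z, m_star(z) / m_star(0.0))
```

For white noise the ratio falls as $(1+z)^{-2}$; a spectrum with more large-scale power (smaller `alpha`) makes $M_*$ fall more steeply with redshift.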

Later in this section we will compare our results with 
simulations; these were kindly made available to the public 
by the Virgo consortium (Frenk et al. 2000). We will analyse results from the set of runs known as the GIF simulations. In particular, we will show results from the SCDM and ΛCDM models, for which $\Lambda = 1 - \Omega$ and $(\Omega, \sigma_8) = (1, 0.6)$ and $(0.3, 0.9)$ respectively. Particle positions and velocities from both simulations were output at a range of redshifts, approximately evenly spaced in logarithmic expansion factor: $\Delta\ln(1+z) \approx 0.0596$. For each output time, we identified
haloes using the spherical overdensity method (e.g. Lacey & 
Cole 1994; Tormen, Moscardini & Yoshida 2003) which con- 
tained at least twenty particles. The required overdensity is 
a cosmology dependent factor times the background density, 
as specified by the spherical collapse model. For the SCDM 
model, this factor is 178, and it is independent of redshift; 
for the ΛCDM model, it is 323 at $z = 0$, and is smaller at higher redshifts (e.g. Peebles 1993). At any given output time $z_1$, we selected the haloes which were composed of more than two hundred particles, and studied the formation times and masses at formation of these haloes as follows. (For reference, an $M_*$ halo at $z = 0$, 0.5 and 1.0 has 1289, 170 and 31 particles in the SCDM run, and 807, 185 and 40 particles in the ΛCDM run, so the high redshift runs mainly probe the formation times and masses of objects much larger than $M_*$.)

Given a halo of mass $M_1$ (i.e., containing $N_1$ particles) at $z_1$, we go to the previous output time ($z_1 + \Delta z_2$, say), identify the object which contributes the largest number of particles to $N_1$, and call it the most massive progenitor at $z_1 + \Delta z_2$. Suppose this most massive progenitor had $N_2$ particles. We then go to the preceding output step ($z_1 + \Delta z_2 + \Delta z_3$, say) and identify the most massive progenitor of $N_2$. We continue in this way until the number of particles in the most massive progenitor first falls below $N_1/2$. If the mass just before formation is $N_n$, then the mass just after formation is $N_{n-1}$, and the redshift of formation is $z_1 + \cdots + \Delta z_{n-1}$. We store these values for each halo $M_1$ at $z_1$.
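
The procedure just described can be sketched in a few lines. The data layout below (one dictionary of halo label to particle-ID set per output, ordered from the parent's output backwards in time) and the names `trace_formation`, `catalogues` and `redshifts` are assumptions made for illustration; they are not the format of the GIF halo catalogues.

```python
def trace_formation(parent_ids, catalogues, redshifts):
    """Follow the most massive progenitor back in time until it drops below N1/2.

    parent_ids -- set of particle IDs in the parent halo at redshifts[0]
    catalogues -- catalogues[k] maps halo label -> set of particle IDs at
                  redshifts[k]; k = 0 is the parent's output, larger k is earlier
    Returns a dict with the formation redshift and the particle numbers just
    after and just before formation, or None if the halo never drops below N1/2.
    """
    n_parent = len(parent_ids)
    current = set(parent_ids)
    for k in range(1, len(catalogues)):
        # Number of particles each earlier halo contributes to the current object.
        shared = {label: len(current & members)
                  for label, members in catalogues[k].items()}
        shared = {label: n for label, n in shared.items() if n > 0}
        if shared:
            best = max(shared, key=shared.get)
            progenitor = catalogues[k][best]
        else:
            progenitor = set()  # no resolved progenitor at this output
        if len(progenitor) < n_parent / 2:
            return {"z_form": redshifts[k - 1],
                    "n_after": len(current),
                    "n_before": len(progenitor)}
        current = progenitor
    return None


# Toy example: a 100-particle parent whose main branch has 70, then 40 particles.
catalogues = [
    {"A": set(range(100))},
    {"a": set(range(70)), "b": set(range(70, 100))},
    {"x": set(range(40)), "y": set(range(40, 70))},
]
redshifts = [0.0, 0.06, 0.13]
result = trace_formation(catalogues[0]["A"], catalogues, redshifts)
print(result)
```

In the toy example the main branch first falls below 50 particles at the third output, so the formation redshift is that of the second output and the formation mass fraction is 70/100.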

The main quantity of interest in what follows is $p(m, z_{\rm f})$, the joint distribution of formation masses and times, where formation is defined to be the time when one of the subclumps of a halo first accounts for at least half the final mass $M_1$ of its parent. Because of this definition of formation, $m/M_1$ is distributed between one-half and unity (recall that the mass of the most massive progenitor must exceed half the mass of its parent).

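The formation time distribution of haloes which have final mass $M_1$ at redshift $z_1$,

$$p(z_{\rm f})\,{\rm d}z_{\rm f} = {\rm d}z_{\rm f} \int p(m, z_{\rm f})\,{\rm d}m, \qquad (1)$$

is expected to be well approximated by

$$p(z_{\rm f})\,{\rm d}z_{\rm f} = p(\omega)\,{\rm d}\omega = 2\omega\,{\rm erfc}\!\left(\frac{\omega}{\sqrt{2}}\right){\rm d}\omega \qquad (2)$$

(Lacey & Cole 1993), where $\omega^2 = (\delta_{\rm cf} - \delta_{\rm c1})^2/(S_{\rm f} - S_1)$, $\delta_{\rm cf} = \delta_{\rm sc}(z_{\rm f})$, $\delta_{\rm c1} = \delta_{\rm sc}(z_1)$, $S_{\rm f} = \sigma^2(M_1/2)$ and $S_1 = \sigma^2(M_1)$. As Lacey & Cole note, this formula is only well-behaved for white-noise initial conditions (for which $\sigma^2(m) \propto 1/m$), although it provides a reasonable approximation in the more general case.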
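
The scaled distribution is easy to check numerically. The sketch below assumes the form $p(\omega) = 2\omega\,{\rm erfc}(\omega/\sqrt{2})$ with $\omega = (\delta_{\rm cf} - \delta_{\rm c1})/\sqrt{S_{\rm f} - S_1}$; the function names and the simple midpoint integrator are illustrative choices.

```python
import math


def omega(delta_cf, delta_c1, s_f, s_1):
    """Scaled formation time for given collapse thresholds and variances."""
    return (delta_cf - delta_c1) / math.sqrt(s_f - s_1)


def p_omega(w):
    """Scaled formation-time density, 2 w erfc(w / sqrt(2)), for w >= 0."""
    return 2.0 * w * math.erfc(w / math.sqrt(2.0))


def integrate(f, a, b, n=20000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# The density is negligible beyond w = 10, so a finite upper limit suffices.
norm = integrate(p_omega, 0.0, 10.0)
mean = integrate(lambda w: w * p_omega(w), 0.0, 10.0)
print(norm, mean)
```

The integral of the density should come out very close to unity, and the mean scaled formation time is of order one, which is why $\omega$ near unity separates early-forming from late-forming haloes in the discussion that follows.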

The distribution of formation masses, obtained by marginalizing over the formation time distribution, is

$$p(m)\,{\rm d}m = {\rm d}m \int p(m, z_{\rm f})\,{\rm d}z_{\rm f}. \qquad (3)$$



Nusser & Sheth (1999) describe a model for the evolution 
of the mass of the most massive progenitor which is able to 
reproduce the formation time distribution of equation (2). 




Figure 1. Distribution of scaled formation times in two different cosmological models, for haloes identified at two different redshifts. In these scaled units, the formation time distribution is expected to be independent of halo mass and final time. Solid curve shows the precise form which this universal formation time distribution is expected to have (equation 2). In all panels, squares and hexagons show the simulation results for parent haloes with masses in the range $4 < M_1/M_*(z_1) < 8$ and $16 < M_1/M_*(z_1) < 32$. Simple bars in the panels on the left show results for slightly lower halo masses: $M_1/M_*(z_1 = 0) < 2$. Error bars were estimated assuming Poisson counts. Evidently, equation (2) provides a reasonable, but not perfect description of halo formation times in the simulations.



In their model, it is possible to derive an expression for the associated formation mass distribution of equation (3). In particular, they argue that

$$p(\mu)\,{\rm d}\mu = \frac{2}{\pi}\sqrt{\frac{1-\mu}{2\mu-1}}\;\frac{{\rm d}\mu}{\mu^2}, \qquad {\rm where}\ 1/2 \le \mu \le 1, \qquad (4)$$

and $\mu = m/M_1$ (equation A15 in Nusser & Sheth 1999).
Strictly speaking this formula, like equation (2), is valid for 
white-noise initial conditions, but Nusser & Sheth argued 
that it should provide a good approximation even if the ini- 
tial spectrum has more large-scale power (see their Fig. A2 
and associated discussion). 
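
A short numerical sketch may help when working with this distribution. It assumes that equation (4) has the white-noise form $p(\mu) = (2/\pi)\sqrt{(1-\mu)/(2\mu-1)}\,\mu^{-2}$ on $1/2 \le \mu \le 1$. With the substitution $1/\mu - 1 = \sin^2\theta$ the cumulative distribution has a closed form, which sidesteps the integrable divergence at $\mu = 1/2$. The names `p_mu` and `cdf_mu` are illustrative.

```python
import math


def p_mu(mu):
    """Assumed white-noise formation-mass density on 1/2 < mu < 1."""
    if mu <= 0.5 or mu >= 1.0:
        return 0.0
    return (2.0 / math.pi) * math.sqrt((1.0 - mu) / (2.0 * mu - 1.0)) / mu ** 2


def cdf_mu(mu):
    """Probability that the formation mass fraction is at most mu."""
    x = min(1.0, max(0.0, 1.0 / mu - 1.0))  # x = 1 at mu = 1/2, x = 0 at mu = 1
    theta = math.asin(math.sqrt(x))
    prob_above = (2.0 * theta - math.sin(2.0 * theta)) / math.pi
    return 1.0 - prob_above


# Fraction of haloes expected to form with less than 55 per cent of the final mass.
print(cdf_mu(0.55))
```

Because the cumulative form is exact, it can be differenced across histogram bins to obtain the expected bin fractions without evaluating the density near its divergence.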

Figure 1 compares the formation time formula, equation (2), with measurements in the GIF simulations. (We have used the notation $\omega_{0.5}$ to emphasize that formation is when the largest progenitor subclump contains at least half the mass of the final parent halo. Our requirement that parent haloes have at least two hundred particles means that we only probe the formation statistics of the most massive haloes at high redshift.) Although lower mass haloes identified at a given time tend to have formed at higher redshifts than more massive haloes (cf. Figure 4 below), equation (2) suggests that, when appropriately rescaled, all dependence on mass, time and the shape of the power spectrum should be removed. The different panels in the figure show that the scaled formation time distributions in the SCDM and ΛCDM runs are reasonably, but not perfectly, well described by equation (2).

Figure 2. The distribution of masses $m$ at formation, for parent haloes which have mass $M_1$ at $z_1 = 0$. Symbols show the simulation results for $M_1/M_*(z_1) < 1$ (dots), $2 < M_1/M_*(z_1) < 4$ (triangles), and $M_1/M_*(z_1) > 8$ (squares). Error bars were estimated assuming Poisson counts. Curves on the right and the left of $m/M_1 = 1/2$ show the distributions in equations (4) and (5) respectively. There is no obvious trend with $M_1$, although haloes in simulations appear to have $m/M_1 \approx 1/2$ slightly more frequently than the model predicts. Results for formation masses of parent haloes identified at other redshifts are similar.

Our next task is to test the accuracy of the formation mass formula, equation (4). The symbols in Figure 2 show the distribution of masses $m$ at formation for haloes in the GIF simulations which have final mass $M_1$ at $z_1 = 0$; dots show $M_1/M_*(z_1) < 1$, open triangles show $2 < M_1/M_*(z_1) < 4$, and squares show $M_1/M_*(z_1) > 8$. Error bars were estimated assuming Poisson counts. The figure shows no clear trend with $M_1$. A similar analysis of the formation masses, using haloes identified at $z = 0.5$ and $z = 1$, yields similar results. The solid curves which span the range $1/2 < m/M_1 < 1$ in the two panels of Fig. 2 show equation (4). Although the formation mass distribution measured in the simulations is significantly different from either a delta function, or a uniform distribution, equation (4) is able to provide a reasonable description of its shape.

We also studied the mass of the most massive progenitor just before the formation time; these are shown by the symbols which span the range $1/4 < m/M_1 < 1/2$ in the two panels. Once again, the measured distribution is neither a delta function nor is it uniform. In this case, also, there is a simple formula for the distribution of masses (equation 5), which holds for $1/4 \le \mu \le 1/2$, with $\mu = m/M_1$ as before (equation A19 in Nusser & Sheth 1999). The curves which span the range $1/4 < m/M_1 < 1/2$ in the two panels of Fig. 2 show this formula; it provides a reasonable description of the measurements in the simulations.

Figure 3. Same as previous plot, but now shown logarithmically, to emphasize the discrepancy near the peak and in the tails.

Although the analytic formulae provide a reasonable description of the measurements, haloes in the simulations appear to have slightly more occurrences of $m/M_1 \sim 0.45$ and $m/M_1 \sim 0.55$ than the formulae predict. Some of the discrepancy may arise because the simulation outputs are not spaced arbitrarily closely in time (the typical redshift steps are of order $\Delta z \sim 0.1$). As a result, the measured distributions almost certainly smooth out the divergence around $\mu \sim 1/2$. (To better illustrate the behaviour around the peak, Figure 3 shows the same distributions, but this time on a logarithmic scale.) In principle, the analysis in Nusser & Sheth (1999) can be used to estimate this smearing-out (their equations A14 and A18 actually depend on the redshift difference), but we believe it will be better to use simulations with better time resolution instead, as these will be available shortly.

Some of the discrepancy may be associated with the fact 
that the approach leads to an underestimate of the mean for- 
mation redshift. This discrepancy could plausibly affect the 
formation mass distribution, since, if formation happens at 
higher redshift when the basic building blocks are smaller, 
then the formation masses are less likely to have values as large as $m/M_1 \sim 1$. Moreover, equations (2)-(5) are derived from
an approach which predicts fewer massive parent haloes than 
are actually observed in simulations (e.g. Sheth & Tormen 
1999). If the abundance of parent haloes is modified so that 
it is in better agreement with simulations, then the forma- 
tion mass and time distributions will also be modified (for 
reasons made explicit in Sheth & Tormen 2002). Accounting 
for this is left for future work, since the agreement between 
the model and the simulations is quite good. 

3 CONDITIONAL DISTRIBUTION OF FORMATION MASS AND TIME

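The joint distribution of formation mass and time for parent haloes with mass $M_1$ at $z_1$ is

$$p(m, z)\,{\rm d}m\,{\rm d}z = {\rm d}s \int_{S_{\rm f}}^{S_m} {\rm d}S\; p(S, z+\Delta z\,|\,s, z)\; p(s, z\,|\,S_1, z_1), \qquad (6)$$

where $s = \sigma^2(m)$, $S_1 = \sigma^2(M_1)$, $S_{\rm f} = \sigma^2(M_1/2)$, $S_m = \sigma^2(m/2)$, and the conditional distribution $p(s, z\,|\,S_1, z_1)\,{\rm d}s$ (equation 7) contains the factor $\exp(-\nu/2)$, with $\nu = [\delta_{\rm sc}(z) - \delta_{\rm sc}(z_1)]^2/(s - S_1)$; a similar expression holds for $p(S, z+\Delta z\,|\,s, z)$. When inserted in equation (1), equation (6) yields equation (2), and when inserted in equation (3) it yields equation (4).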

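In the limit of small time steps ($\Delta z \ll 1$), and a white-noise power spectrum, equation (6) simplifies considerably. A little algebra shows that, for haloes of fixed mass $M_1$, the conditional distribution of formation masses $m$ when it is known that the formation time was $z_{\rm f}$ is given by

$$p(\mu|z_{\rm f})\,{\rm d}\mu = \frac{p(\mu, z_{\rm f})\,{\rm d}\mu}{p(z_{\rm f})} = p(\mu)\,{\rm d}\mu\; \frac{\exp\left[-\dfrac{\omega^2}{2}\,\dfrac{(S_{\rm f} - S_1)}{(s - S_1)}\right]}{(s/S_1 - 1)\; 2\,{\rm erfc}(\omega/\sqrt{2})}, \qquad (8)$$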



where $\mu = m/M_1$, $s = \sigma^2(m)$, $S_1 = \sigma^2(M_1)$, $S_{\rm f} = \sigma^2(M_1/2)$, and $\omega$ was defined in equation (2). For a white-noise spectrum, $s/S_1 = \mu^{-1}$ and it is straightforward to verify that this distribution is correctly normalized. For more general power spectra, $s/S_1 \propto m^{-\alpha}$, say, this conditional distribution must be multiplied by a normalization factor which depends on $\alpha$. We have checked that use of the white-noise expression is a good approximation to the curves associated with $\alpha < 1$, provided we ensure that the distribution is correctly normalized to unity. Thus, although equation (8) only holds for a white-noise power spectrum, we expect it to be more generally applicable for the same reasons that our equations (4) and (5) are more generally applicable. In what follows, therefore, we will simply set $s/S_1 = \mu^{-1}$ and $S_{\rm f}/S_1 = 2$. In this approximation, our expression for the conditional distribution of formation masses is independent of power spectrum.
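
The normalization claim can be checked numerically. The sketch below assumes the white-noise forms $p(\mu) = (2/\pi)\sqrt{(1-\mu)/(2\mu-1)}\,\mu^{-2}$ and $p(\mu|\omega) = p(\mu)\,\exp[-\omega^2/(2x)]\,/\,[x\;2\,{\rm erfc}(\omega/\sqrt{2})]$ with $x = s/S_1 - 1 = 1/\mu - 1$. The integral is taken in the variable $\theta$, where $x = \sin^2\theta$, so that the divergence of $p(\mu)$ at $\mu = 1/2$ is cancelled by the Jacobian. All function names are illustrative.

```python
import math


def p_mu(mu):
    """Assumed white-noise formation-mass density on 1/2 < mu < 1."""
    if mu <= 0.5 or mu >= 1.0:
        return 0.0
    return (2.0 / math.pi) * math.sqrt((1.0 - mu) / (2.0 * mu - 1.0)) / mu ** 2


def p_mu_given_omega(mu, w):
    """Assumed conditional density of mu at fixed scaled formation time w."""
    if mu <= 0.5 or mu >= 1.0:
        return 0.0
    x = 1.0 / mu - 1.0  # s/S_1 - 1 for white noise
    boost = math.exp(-0.5 * w * w / x) / (x * 2.0 * math.erfc(w / math.sqrt(2.0)))
    return p_mu(mu) * boost


def moments(w, n=4000):
    """Return (normalization, mean mu) of the conditional density at fixed w."""
    h = (math.pi / 2.0) / n
    norm = 0.0
    first = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        s2 = math.sin(theta) ** 2
        mu = 1.0 / (1.0 + s2)
        # |d mu / d theta| for mu = 1 / (1 + sin^2 theta)
        jac = 2.0 * math.sin(theta) * math.cos(theta) / (1.0 + s2) ** 2
        weight = p_mu_given_omega(mu, w) * jac * h
        norm += weight
        first += mu * weight
    return norm, first / norm


for w in (0.5, 1.0, 2.0):
    print(w, moments(w))
```

Under these assumptions the normalization stays at unity for every $\omega$, while the mean formation mass fraction moves towards 1/2 as $\omega$ increases, in line with the trend discussed next.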

The factor which multiplies $p(\mu)$ is largest at $s/S_1 - 1 = \omega^2/2$, so objects which form at redshifts which are lower than the mean value for that mass (i.e., $\omega < 1$) are expected to have formation masses which are biased towards $\mu \sim 1$ (i.e., $s \sim S_1$). Conversely, objects which form at abnormally high redshifts ($\omega > 1$) are expected to have formation masses which are closer to the minimum value allowed: $\mu \sim 1/2$. Presumably, this is a consequence of the fact that, to have $\mu \sim 1$ requires two pieces each of size $\mu \sim 1/2$. In a hierarchical model, the building blocks available to form the parent halo are, on average, smaller at early times: when the probability of having an object of mass $\approx 1/2$ is small, the chance of having two such objects is smaller still. In effect, our formula (8) quantifies the importance of this effect.

Figure 4 shows the joint distribution of formation mass and time for parent haloes identified at $z_1 = 0$ in the SCDM (top) and ΛCDM (bottom) simulations. (The stripes are a result of the fact that simulation outputs are written to file only at finitely many time-steps.) The axis labels use the notation $m/M_1$ to denote the ratio of the formation mass to final mass, $z_{\rm f}$ the formation redshift, and $z_1$ the redshift at which the parent object was identified. The two panels for each simulation show results for different choices of the parent halo mass. Analogous plots for $z_1 = 0.5$ and $z_1 = 1.0$ look very similar, provided we scale the formation time axis to $\delta_{\rm sc}(z_{\rm f})/\delta_{\rm sc}(z_1) - 1$ as we have done, rather than simply show $z_{\rm f}$. We have chosen to not include them here. (The natural rescaling would have been to show $\omega$, defined in equation 2, along the x-axis. This would differ from the rescaling we show by a factor of $\sqrt{S_{\rm f} - S_1}/\delta_{\rm sc}(z_1)$. We chose not to scale by this additional factor because one of the points we wish to emphasize is that the formation mass formulae turn out to be approximately independent of power spectrum.)

Figure 4. Joint distribution of formation times and masses measured in the simulations. Haloes which form at higher redshifts appear to have a smaller spread in formation masses. This is quantified in the next figure, which includes a comparison with the model predictions.

The formation time distribution discussed in the previous section is obtained by summing up all haloes with the same $z_{\rm f}$ whatever their value of $m/M_1 \ge 1/2$. The formation mass distributions studied in the previous section were obtained by summing up all haloes with the same $m/M_1$ whatever their value of $z_{\rm f}$.
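
The two marginal distributions can be built from the joint counts as follows. This is a minimal sketch on made-up data: the list `haloes` of (formation mass fraction, formation redshift) pairs, the bin count and the function names are all hypothetical, and the Poisson error is simply the square root of the count in each bin.

```python
import math
from collections import Counter


def bin_index(mu, nbins):
    """Index of the equal-width bin on [1/2, 1] that contains mu."""
    i = int((mu - 0.5) / 0.5 * nbins)
    return min(max(i, 0), nbins - 1)


def marginals(haloes, nbins=5):
    """Joint counts in (mass bin, formation redshift) and the two marginals."""
    joint = Counter((bin_index(mu, nbins), z_f) for mu, z_f in haloes)
    by_time = Counter()
    by_mass = Counter()
    for (i, z_f), n in joint.items():
        by_time[z_f] += n  # sum over all formation masses at fixed z_f
        by_mass[i] += n    # sum over all formation redshifts at fixed mass bin
    return joint, by_time, by_mass


def poisson_error(count):
    """Poisson uncertainty on a bin count."""
    return math.sqrt(count)


# Made-up sample: (m/M_1 at formation, formation redshift at an output time).
haloes = [(0.55, 0.5), (0.65, 0.5), (0.93, 0.2), (0.52, 1.0), (0.57, 0.5)]
joint, by_time, by_mass = marginals(haloes)
for z_f in sorted(by_time):
    print(z_f, by_time[z_f], poisson_error(by_time[z_f]))
```

Because formation redshifts can only take the values of the simulation outputs, they are used directly as keys rather than being binned.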

Notice that there appears to be a tendency for the objects with large $z_{\rm f}$ to have small values of $m/M_1$, but because there are many fewer haloes with high formation redshifts, it is not obvious if this trend is real, or if it is simply a consequence of small-number statistics.

To address this in more detail, Figure 5 shows $p(\mu|z_{\rm f})$, the conditional distribution of formation masses at fixed formation time. The plot was made by choosing all haloes with masses in the range $1 < M_1/M_*(z_1) < 2$ at $z_1 = 0.5$, and then studying the mass at formation in the subset which formed at $z_{\rm f} = 0.61$, 1.31 and 2.31. Histograms show the measurements in the simulations. Comparison of the different panels shows that the objects which form at higher redshifts have formation masses which are close to 1/2, whereas there is an obvious tail of higher formation masses at lower formation redshifts. The smooth curves show equation (8); it reproduces the trend with formation redshift seen in the simulations quite well. We find similar agreement for other choices of $M_1$, $z_1$ and $z_{\rm f}$, so we conclude that equation (8) provides a reasonable description (by which we mean it is a better fit than is a delta function, or a uniform distribution) of the conditional distribution of formation mass when the formation time is known.

Figure 5. Conditional distribution of masses $m$ at formation, given that the mass of the parent halo was in the range $1 < M_1/M_*(z_1) < 2$ at $z_1 = 0.5$, for a range of choices of the redshift of formation (labeled in the middle of each panel). Symbols show the measurements in the simulations, and curves show equation (8).



4 DISCUSSION

We presented evidence that formulae for the distribution of formation masses (equations 4 and 5) were reasonably accurate (Figure 2). These formulae do not depend on the shape of the underlying power spectrum, so they are simple to use. We then derived an expression for the conditional distribution of formation masses if the formation time is known (equation 8), and showed that it was also in quite good agreement with measurements made in simulations (Fig. 5). Application of Bayes' rule then gives the joint distribution of formation mass and time.

Our results indicate that haloes which form at abnormally early times are more likely to have formation masses of order one-half that of the final mass of the parent, whereas haloes which form at abnormally late times are more likely to have formation masses which are closer to that of the parent. One consequence of this is that haloes which form late are more likely to have experienced a recent major merger. We argued that this was a generic consequence of hierarchical formation.

Our formulae for the joint distribution of formation masses and times will find use in studies which attempt to relate the structure of a halo to its formation history (e.g. Tormen 1997, 1998; Tormen, Diaferio & Syer 1998; van den Bosch 2002; Wechsler et al. 2002; Zhao et al. 2003). For instance, haloes which formed recently with large formation masses are almost certainly further from equilibrium than haloes which formed at higher redshift with formation masses of order fifty per cent. Such haloes (i.e. ones which have suffered major mergers recently) may plausibly be less centrally concentrated than haloes of the same mass which had more quiescent accretion histories. Addressing such issues is the subject of on-going work. If these formulae do prove to be useful, it will become necessary to modify them slightly so that they are more fully consistent with the parent halo mass function described by Sheth & Tormen (1999).

We would like to thank the Aspen Center for Physics for support, and for providing the stimulating environment in which this work was completed. We would also like to thank the Virgo consortium for making the simulation data used here publicly available at http://www.mpa-garching.mpg.de/Virgo. RKS was supported by the DOE and NASA grant NAG 5-7092 at Fermilab when work on this project began, and acknowledges support from NSF grant AST-0307747.